WIP: threshold optimizer with relaxed fairness constraint fulfillment #1248
Conversation
I do realize this is still WIP but I thought I'd share some early thoughts 🙂
We'll also need to think about compatibility with the existing plotting code:
def plot_threshold_optimizer(threshold_optimizer, ax=None, show_plot=True):
Are there any reasonable plots you generate for the relaxed version?
```diff
@@ -18,7 +18,7 @@ class ThresholdOperation:
     """

     def __init__(self, operator, threshold):
-        if operator not in [">", "<"]:
+        if operator not in [">", "<"]:  # NOTE for PR: sklearn uses >= for ROC threshold; see: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html
```
Interesting! The way we use it, it shouldn't really be crucial whether it's >= or >, so we can probably change it. The thresholds are always chosen between the input scores; e.g., between 0.5 and 0.6 we choose 0.55, so they're meant to fall strictly on one side or the other. I would have to make sure that nothing changes if we make a change here, though. Does this relate to your PR in some way?
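To illustrate, here is a minimal sketch of the midpoint scheme described above (the helper name is hypothetical, not the actual fairlearn internals):

```python
import numpy as np

def _midpoint_thresholds(scores):
    """Candidate thresholds strictly between consecutive distinct scores.

    Because no observed score ever equals a candidate threshold,
    `>` and `>=` produce identical predictions for these thresholds.
    """
    unique = np.unique(scores)  # sorted, distinct score values
    return (unique[:-1] + unique[1:]) / 2

# e.g. observed scores 0.5 and 0.6 yield the single candidate 0.55
print(_midpoint_thresholds(np.array([0.5, 0.6, 0.6])))  # [0.55]
```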
I think the thresholds in sklearn are the predicted scores for one or more instances (see code here), so using > instead of >= would mean flipping the predictions for some portion of the samples.
If it's possible to change it here, that would be great!
If not, is there any fairlearn code to build the (fpr, tpr, threshold) ROC triplets? I could switch to using that.
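To make the tie-handling point concrete, a small self-contained example using sklearn's public roc_curve (an illustration, not code from the PR):

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.4, 0.8])

# sklearn's thresholds are (a subset of) the observed scores themselves,
# and its ROC points correspond to `y_score >= threshold`.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

t = 0.4  # a threshold that coincides with observed scores
print((y_score >= t).astype(int))  # [0 1 1 1]
print((y_score > t).astype(int))   # [0 0 0 1]  <- ties flip predictions
```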
Take a look at _tradeoff_curve here: https://github.com/fairlearn/fairlearn/blob/main/fairlearn/postprocessing/_tradeoff_curve_utilities.py
```
Parameters
----------
estimator : object
```
Are you assuming that it's pre-fit (trained) or not? estimator kind of implies that it's not, while predictor (in sklearn) means it is pre-fit. We have some logic and an argument in ThresholdOptimizer to check for this:
prefit : bool, default=False
Obviously, when adding this functionality to the TO class you can take advantage of that and make sure you pass only a trained model in. In that case, you may want to call this predictor (?) but perhaps I'm splitting hairs now.
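For reference, the usual sklearn-style pattern behind such a prefit flag looks roughly like this (a sketch with a hypothetical helper name, not the actual ThresholdOptimizer code):

```python
from sklearn.base import clone
from sklearn.utils.validation import check_is_fitted

def _resolve_estimator(estimator, X, y, prefit=False):
    """Return a fitted model, honoring a sklearn-style `prefit` flag."""
    if prefit:
        # Caller promises the model is already trained; verify and use as-is.
        check_is_fitted(estimator)
        return estimator
    # Otherwise fit a fresh clone so the caller's object is left untouched.
    return clone(estimator).fit(X, y)
```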
```python
# Compute group-wise ROC curves
if y_scores is None:
    y_scores = self.estimator(X)
```
We're doing this a bit differently in TO and I would argue it's preferable:
scores = _get_soft_predictions(self.estimator_, X, self._predict_method)
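For anyone following along, my rough understanding of what _get_soft_predictions does (a sketch only; see the actual fairlearn source for the authoritative version):

```python
def _get_soft_predictions_sketch(estimator, X, predict_method="auto"):
    """Rough sketch of fairlearn's `_get_soft_predictions` behavior."""
    if predict_method == "auto":
        # Prefer probabilities, then decision margins, then hard labels.
        for name in ("predict_proba", "decision_function", "predict"):
            if hasattr(estimator, name):
                predict_method = name
                break
    output = getattr(estimator, predict_method)(X)
    # For probability outputs, keep only the positive-class column.
    return output[:, 1] if predict_method == "predict_proba" else output
```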
ah right, that was still using the previous callable API.
By the way, we can of course omit the y_scores here; it's just here because this is the slowest part of the whole fit method, and oftentimes users already have the predictions computed.
An example is when you want to map the whole fairness-accuracy Pareto frontier: you'd call RelaxedThresholdOptimizer with a bunch of tolerance values (e.g., np.arange(0, 1, 1e-2)), and each call would unnecessarily re-compute predictions. This is in fact the only part of the fit method that scales with the size of the dataset; everything else is generally fast even on large datasets.
I get your point about performance, but I'm not entirely sure why a user would precompute the scores just to pass them in manually rather than having the class compute them. Or do you mean that they need the scores for other purposes, so they have to compute them outside anyway, and computing them twice would feel like a waste?
Hmm, I don't have a strong opinion on how exactly it is implemented, but I think it would be useful to somehow cache these predictions so we don't have to re-compute them for different levels of tolerance. Given that different values of tolerance will be handled by different class objects, I don't see how the cache could live at the class level (it must live outside). Example follows: when mapping the fairness-accuracy Pareto frontier, the user needs to create a different RelaxedThresholdOptimizer for each tolerance value, e.g.:
```python
# Compute predictions for models with varying levels of tolerance
def compute_test_predictions_with_relaxed_constraints(tolerance: float, y_scores=None) -> np.ndarray:
    # Instantiate
    clf = _RelaxedThresholdOptimizer(
        predictor=lambda *args, **kwargs: unmitigated_predictor.predict(*args, **kwargs),
        predict_method="__call__",
        constraint="equalized_odds",
        tolerance=tolerance,
    )
    if y_scores is None:
        # 1st Option: will call `predictor.predict(X_train)`
        clf.fit(X_train, Y_train, sensitive_features=A_train_np)
    else:
        # 2nd Option: will *not* call `predictor.predict(X_train)`
        clf.fit(X_train, Y_train, sensitive_features=A_train_np, y_scores=y_scores)
    return clf.predict(X_test, sensitive_features=A_test_np)

# [For 2nd Option] Pre-compute predictions
Y_train_scores = unmitigated_predictor.predict(X_train)

# Compute predictions at different levels of tolerance
all_model_predictions = {
    f"train tolerance={tol:.1}": compute_test_predictions_with_relaxed_constraints(tol, y_scores=Y_train_scores)
    for tol in np.arange(0, unmitigated_equalized_odds_diff, 1e-2)
}
```
Given that the underlying predictor is the same for all of these objects, clf.fit(.) will call predictor.predict_proba(.) repeatedly (and redundantly) for each tolerance value. The 1st option re-computes the scores via the predictor's predict_method on every fit call, while the 2nd option reuses the pre-computed scores.
PS: ignore the predict_method="__call__" mess for now 😅
```
@@ -0,0 +1,461 @@
"""Helper functions to construct and use randomized classifiers.

TODO: this module will probably be substituted by the InterpolatedThresholder
```
If we need to make adjustments to InterpolatedThresholder, let's do it (unless it's too complicated).
Thanks for the feedback!
Force-pushed from 6498c81 to 07ffdb8.
Force-pushed from dbde72a to b6ada3e.
Regarding the plot, I can add that exact plotting function from Figure 9 as implemented here. For now, I wanted to get the core working and ready to merge; the plotting is somewhat independent of the rest. I'll add a check for when this class is passed to the current plotting function.
I completely agree.
Hi @AndreFCruz, any interest in continuing the work on this PR? @romanlutz, what would it take for this to be merged?
Hi! I could take a look at what's missing over the next few weeks.
Thanks! Let me know if I can support the efforts somehow.
Hi @TamaraAtanasoska, I'll close the PR for now and re-open it if this changes.
Description
This PR corresponds to Issue #1246 (discussion is ongoing over there).
General summary: implementing ThresholdOptimizer with relaxed fairness constraint fulfillment.
Opened the PR as a draft to ease code discussion on the ongoing implementation.
Tests
Documentation
TODO
- InterpolatedThresholder instead of our custom RandomizedClassifier implementations.
- RelaxedThresholdOptimizer (same or different class as ThresholdOptimizer).

Consolidating code with existing code-base
The _randomized_classifiers.py module provides two main functionalities:
- RandomizedClassifier: constructing a randomized classifier at a given ROC point. This implies triangulating the target ROC point as a linear combination of realized (deterministic) ROC points on the realized ROC curve.
- EnsembleGroupwiseClassifiers: bringing all group-specific classifiers together under a single classifier object.

Part (or all?) of this functionality seems to be covered by the InterpolatedThresholder class.
Are there any examples of using an InterpolatedThresholder object to triangulate an ROC point?
What other functionality do you reckon could be duplicated?
@romanlutz @MiroDudik
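For context, a minimal sketch of the "triangulation" idea behind RandomizedClassifier (hypothetical function, not the PR's actual implementation): a target ROC point lying between the ROC points of two deterministic thresholds is attained, in expectation, by randomizing between them.

```python
import numpy as np

def randomized_predict(scores, t_lo, t_hi, p_hi, rng=None):
    """Randomize between two deterministic thresholds t_lo < t_hi.

    If (fpr_lo, tpr_lo) and (fpr_hi, tpr_hi) are the ROC points of the two
    thresholds, the expected ROC point of this classifier is the convex
    combination (1 - p_hi) * (fpr_lo, tpr_lo) + p_hi * (fpr_hi, tpr_hi).
    """
    rng = np.random.default_rng() if rng is None else rng
    use_hi = rng.random(len(scores)) < p_hi   # pick t_hi w.p. p_hi per sample
    thresholds = np.where(use_hi, t_hi, t_lo)
    return (scores > thresholds).astype(int)
```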